Writing x86 assembly for 9 operating systems


DOS (raw; fasm/nasm)

DOS supports one of the most basic executable formats known to man. Up to 65,280 bytes of code and data are loaded into a shared segment, starting at address 0x100. cs, ds, es and ss all point to it when the program loads.

System calls are performed by loading parameters into the appropriate registers, then issuing interrupt 0x21. ah specifies the system call number; a value of 9 instructs DOS to print a dollar-terminated string at address ds:dx. Further values are described in Ralf Brown's Interrupt List.

There are several ways to terminate the program. One can perform a near return, issue interrupt 0x20 or invoke interrupt 0x21 with ah set to zero. The second approach happens to be the fastest.

org 0x100

hello:
    mov ah, 9
    mov dx, text
    int 0x21
    int 0x20

text db "Hello, World!$"
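
To make the effect of org 0x100 concrete, the program above can be assembled by hand. The following Python sketch is only an illustration and not part of the toolchain used here; it hard-codes the usual 8086 encodings of the four instructions and assumes the string directly follows the code. The names LOAD_ADDRESS, MAX_COM_SIZE and build_com are hypothetical.

```python
import struct

LOAD_ADDRESS = 0x100              # DOS loads the image at offset 0x100
MAX_COM_SIZE = 0x10000 - 0x100    # 65,280 bytes remain in the 64 KiB segment

def build_com(message: bytes) -> bytes:
    """Hand-assemble the hello program; message should end with b'$'."""
    code_size = 2 + 3 + 2 + 2                  # sizes of the four instructions
    text_address = LOAD_ADDRESS + code_size    # where the string lands in memory
    code = (
        b"\xB4\x09"                                  # mov ah, 9
        + b"\xBA" + struct.pack("<H", text_address)  # mov dx, text
        + b"\xCD\x21"                                # int 0x21
        + b"\xCD\x20"                                # int 0x20
    )
    image = code + message
    if len(image) > MAX_COM_SIZE:
        raise ValueError("image does not fit in a single segment")
    return image

image = build_com(b"Hello, World!$")
print(image.hex(" "))
```

The operand of mov dx comes out as 0x109: nine bytes of code placed after the 0x100-byte offset. Without the org directive, the assembler would compute 9 instead and DOS would print whatever happens to sit at ds:9.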

DOS (MZ; fasm/nasm)

The sheer simplicity of the raw format comes at a price. The 64 KiB limit imposes a needless restriction on most DOS systems, which have memory capacities measured in megabytes or gigabytes. The format also lacks support for relocation.

To get around these limitations, DOS introduced a second executable format. It can store up to 1 MiB of code and data, with relocatable values for the code and stack segment. The binary resides at address 0x100 in the data segment.

This time around, the program is terminated by issuing interrupt 0x21 with ah set to 0x4C. al holds the exit code.

format MZ
org 0x100

hello:
    mov ah, 9
    mov dx, text
    int 0x21
    mov ax, 0x4C00
    int 0x21

text db "Hello, World!$"

16-bit Windows (wasm)

16-bit Windows adds another executable format—the New Executable. Applications have their working set spread across multiple code and data segments. The Task Manager reserves the first 16 bytes of the first data segment; therefore, at least one such segment must be present.

Windows does not have a documented system call interface. Instead of invoking a single handler, Windows applications load a named function from a system library and call it. Said functions are declared using the extrn directive.

With functions comes the concept of a calling convention; while DOS passes and returns values in arbitrary registers, Windows follows a set of rules for input and output. Parameters are pushed onto the stack from left to right. Functions clean the stack and return a 16-bit value in ax, or a 32-bit one in dx:ax. The bx, cx, dx, es and floating-point registers can be freely modified by the function.

Win16 programs require a bit of initialisation code. First, one must call InitTask. This function stores the arguments to WinMain in several registers. The most important one is di, which holds a handle to the program instance. This value is immediately passed to InitApp to prepare the entry point. If either function returns zero, the program failed to load and must terminate. 32-bit Windows does not enforce this; when in doubt, Win16 applications should be tested on 16-bit Windows or OS/2.

Being a graphical environment, 16-bit Windows supports windowed applications, which communicate with the user by displaying dialog boxes. A simple dialog can be created using the MessageBox function. The first parameter is the handle of the parent window, which can be zero. The next three arguments consist of the message the dialog box shall contain, a title for the dialog box and a bit mask for alternative styles.

A highly-accurate API reference for 16-bit Windows is provided in the Windows 3.1 SDK. If it's not available, a list of Win32 API functions can be found on Microsoft's website [HTTPS]. The "Minimum supported client/server" field can be safely disregarded; many of the listed functions are already present on 16-bit Windows, save for ones with "Ex" in the name.

The Win16 API does not constitute a complete replacement for the DOS API. A number of crucial tasks, including file I/O and program termination, are still handled using DOS system calls.

extrn InitApp:proc
extrn InitTask:proc
extrn MessageBox:proc

.code

hello:
    call InitTask
    cmp ax, 1
    jb quit
    push di
    call InitApp
    cmp ax, 1
    jb quit
    xor bx, bx
    push bx
    push ds
    mov ax, offset _text
    push ax
    push ds
    mov ax, offset _title
    push ax
    push bx
    call MessageBox
quit:
    mov ah, 0x4C
    sbb al, al
    int 0x21

.data

db 16 dup (?)

_text  db "Hello, World!",0
_title db "Message",0

end hello

32-bit Windows (fasm)

Windows NT 3.1 and 95 adopted a fourth executable format that remains in use to this day. It is platform-independent and replaces segments with sections, which similarly designate blocks of data.

32-bit Windows programs can run in the console or GUI subsystem. Console applications support command-line input and output, yet have full access to the Windows API. When a console application loads, it either receives a Command Prompt window to draw on or uses an existing one.

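The 32-bit Windows calling convention (stdcall) possesses a few key differences: parameters are pushed onto the stack from right to left; functions return a 32-bit value in eax, or a 64-bit one in edx:eax; and only eax, ecx and edx can be freely modified by the function. Functions still clean the stack.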

Win32 API functions that take string parameters are further divided into ANSI (A) and wide-character (W) variants. ANSI functions take strings in the default character encoding, which is usually Windows-1252. Wide-character functions handle a greater range of characters by working with UTF-16 strings. The latter category requires Windows NT or the Microsoft Layer for Unicode.

format PE GUI

include 'win32a.inc'

section '.text' executable

hello:
    push 0
    push title
    push text
    push 0
    call [MessageBoxA]
    push 0
    call [ExitProcess]

section '.data' readable
    text  db "Hello, World!",0
    title db "Message",0

section '.idata' import readable
    library kernel32, 'KERNEL32.DLL', user32, 'USER32.DLL'
    import kernel32, ExitProcess, 'ExitProcess'
    import user32, MessageBoxA, 'MessageBoxA'
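
The difference between the A and W variants comes down to how the string bytes are laid out in memory. The Python sketch below is an illustration only; it assumes Windows-1252 as the default encoding, and the helper names ansi_string and wide_string are hypothetical.

```python
def ansi_string(text: str) -> bytes:
    """Null-terminated string for an A function, assuming Windows-1252."""
    return text.encode("cp1252") + b"\x00"

def wide_string(text: str) -> bytes:
    """Null-terminated UTF-16 (little-endian) string for a W function."""
    return text.encode("utf-16-le") + b"\x00\x00"

message = "Hello, World!"
print(len(ansi_string(message)))   # 14 bytes: one per character plus the terminator
print(len(wide_string(message)))   # 28 bytes: two per character plus the terminator
print(wide_string("Hi").hex(" "))  # 48 00 69 00 00 00
```

In assembly, this is the difference between declaring a string with byte-sized and word-sized data directives.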

64-bit Windows (fasm)

With the advent of 64-bit Windows, a few tweaks were made to improve function call performance. The first four parameters are now stored in rcx, rdx, r8 and r9, with further arguments passed on the stack. The caller allocates another 32 bytes of stack space to hold the four parameter registers. This can be done once for multiple calls, bypassing the need for several push/pop sequences.

The 64-bit API additionally expects a stack pointer aligned to a 16-byte boundary. This allows Windows to quickly store and initialise vector registers from the stack.

format PE64 GUI

include 'win64w.inc'

section '.text' executable

hello:
    and rsp, -16        ; align the stack to a 16-byte boundary
    sub rsp, 32         ; reserve space for the four parameter registers
    xor ecx, ecx
    mov rdx, text       ; second parameter: message
    mov r8, title       ; third parameter: title
    xor r9d, r9d
    call [MessageBoxW]
    xor ecx, ecx
    call [ExitProcess]

section '.data' readable
    text  dw 'H','e','l','l','o',',',' ','W','o','r','l','d','!',0
    title dw 'M','e','s','s','a','g','e',0

section '.idata' import readable
    library kernel32, 'KERNEL32.DLL', user32, 'USER32.DLL'
    import kernel32, ExitProcess, 'ExitProcess'
    import user32, MessageBoxW, 'MessageBoxW'
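
The stack alignment rule is easy to check with plain arithmetic. The Python sketch below models a 64-bit register and is only an illustration; the starting stack pointer is a made-up value and the names MASK64 and align_stack are hypothetical.

```python
MASK64 = (1 << 64) - 1   # keep results within a 64-bit register

def align_stack(rsp: int, boundary: int = 16) -> int:
    """Round rsp down to a multiple of boundary, as and-ing with -boundary does."""
    return rsp & (-boundary & MASK64)

rsp = 0x7FFFFFFFE4A8          # hypothetical stack pointer on entry
aligned = align_stack(rsp)    # low four bits cleared
shadowed = aligned - 32       # after reserving the 32-byte parameter area
print(hex(aligned), hex(shadowed))
```

Since 32 is itself a multiple of 16, reserving the parameter area after aligning leaves the stack pointer aligned for the call.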

32-bit Linux (fasm)

Linux originally used the a.out executable format, inherited from Unix. Difficulties in supporting dynamic linking drove the kernel to implement the Executable and Linkable Format, introduced with System V Release 4. Support for a.out executables was dropped with Linux 5.18, released in 2022.

Similarly to DOS, Linux provides a unified, register-based system call interface. eax contains the system call number, while ebx, ecx and edx hold the first three parameters (if any).

Linux implements console output as an abstraction on top of file I/O. A value of 4 in eax corresponds to the write system call, which writes binary data to a file. The first parameter specifies the target file descriptor, set to 1 here to indicate standard output. A pointer to a block of data comes next, followed by its size as the last parameter.

The program is terminated using the exit system call. It takes an exit code as its only argument.

A full list of Linux system calls for IA-32 can be found here [HTTPS].

format ELF executable

segment readable executable

hello:
    mov eax, 4
    mov ebx, 1
    mov ecx, text
    mov edx, 13
    int 0x80
    mov eax, 1
    xor ebx, ebx
    int 0x80

text db "Hello, World!"

64-bit Linux (fasm)

While 32-bit Linux used a similar calling convention to 32-bit Windows, 64-bit Linux switched to the System V AMD64 ABI. rax still holds the system call number; however, the first six system call parameters are now stored in rdi, rsi, rdx, r10, r8 and r9, respectively. Ordinary function calls pass the fourth parameter in rcx instead; system calls use r10 because the syscall instruction overwrites rcx and r11. Besides the parameter registers, functions can freely modify rax, r10, r11, the floating-point registers and the SSE/AVX registers.

64-bit Linux supports the same set of system calls as its predecessor, albeit with the numbers remapped. The write system call is now mapped to the number 1, while a value of 60 corresponds to the exit system call. Linux distributions commonly include a header named syscall.h, which contains a list of 64-bit Linux system calls; it usually resides in /usr/include/sys.

System calls can still be made using int 0x80; however, this invokes the 32-bit system call interface, with untold side effects. 64-bit system calls are performed using the syscall instruction. The latter approach provides a significant performance boost, since the new instruction skips several bounds and privilege checks that do not apply in long mode.

format ELF64 executable

segment readable executable

hello:
    mov eax, 1
    mov edi, 1
    mov rsi, text
    mov edx, 13
    syscall
    mov eax, 60
    xor edi, edi
    syscall

text db "Hello, World!"

64-bit FreeBSD (as)

FreeBSD combines the 32-bit Linux system call mapping with the calling convention from its 64-bit counterpart.

On BSD, one must specify the program's entry point by defining a global symbol named _start.

.section .text
.globl _start

_start:
    mov $4, %eax
    mov $1, %edi
    mov $text, %rsi
    mov $13, %edx
    syscall
    mov $1, %eax
    xor %edi, %edi
    syscall 

text: .ascii "Hello, World!"
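
The three Unix-like interfaces covered so far differ only in the system call numbers, the parameter registers and the instruction that enters the kernel. The Python sketch below summarises them as a lookup table, using only the values given in the text. It prints Intel-syntax lines purely for comparison; the names SyscallAbi, ABIS and emit are hypothetical.

```python
from typing import NamedTuple

class SyscallAbi(NamedTuple):
    instruction: str   # instruction that enters the kernel
    number_reg: str    # register holding the system call number
    arg_regs: tuple    # registers for the first three parameters
    numbers: dict      # system call name -> number

ABIS = {
    "linux32":   SyscallAbi("int 0x80", "eax", ("ebx", "ecx", "edx"), {"write": 4, "exit": 1}),
    "linux64":   SyscallAbi("syscall",  "rax", ("rdi", "rsi", "rdx"), {"write": 1, "exit": 60}),
    "freebsd64": SyscallAbi("syscall",  "rax", ("rdi", "rsi", "rdx"), {"write": 4, "exit": 1}),
}

def emit(platform: str, name: str, *args) -> list:
    """Return the instruction sequence for one system call on the given platform."""
    abi = ABIS[platform]
    lines = [f"mov {abi.number_reg}, {abi.numbers[name]}"]
    lines += [f"mov {reg}, {arg}" for reg, arg in zip(abi.arg_regs, args)]
    lines.append(abi.instruction)
    return lines

for line in emit("freebsd64", "write", 1, "text", 13):
    print(line)
```

The FreeBSD row pairs the numbers from the linux32 row with the registers and instruction from the linux64 row.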

64-bit NetBSD (as)

In addition to the usual code and data sections, NetBSD requires a .note.netbsd.ident section that identifies the operating system the executable is meant for. 7 is the length of the operating system name, including the null terminator, while 4 is the size of the system's version number in bytes.

.section .text
.globl _start

_start:
    mov $4, %eax
    mov $1, %edi
    mov $text, %rsi
    mov $13, %edx
    syscall
    mov $1, %eax
    xor %edi, %edi
    syscall 

text: .ascii "Hello, World!"

.section .note.netbsd.ident, "a"
    .align 2
    .long 7, 4, 1
    .asciz "NetBSD\0"
    .long 0
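
The layout of the note can be cross-checked outside the assembler. The Python sketch below builds the same bytes, assuming the standard ELF note layout: three little-endian 32-bit fields, followed by the name and the descriptor, each padded to a multiple of four bytes. The helper names pad4 and elf_note are hypothetical.

```python
import struct

def pad4(data: bytes) -> bytes:
    """Pad data with zero bytes up to a multiple of four."""
    return data + b"\x00" * (-len(data) % 4)

def elf_note(name: str, desc: bytes, note_type: int) -> bytes:
    """Build one ELF note: name size, descriptor size, type, name, descriptor."""
    name_z = name.encode("ascii") + b"\x00"    # the name size counts the terminator
    header = struct.pack("<III", len(name_z), len(desc), note_type)
    return header + pad4(name_z) + pad4(desc)

note = elf_note("NetBSD", struct.pack("<I", 0), 1)
print(len(note))            # 24
print(note[:12].hex(" "))   # 07 00 00 00 04 00 00 00 01 00 00 00
```

The padding explains the extra \0 in the listing: the seven-byte name is rounded up to eight bytes so that the version number starts on a four-byte boundary.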

64-bit OpenBSD (as)

The OpenBSD project strives for security and readily breaks binary compatibility to achieve this goal. From version 7.3 onwards, programs can no longer read their own code section; this necessitates the creation of a data section to hold the output string.

Starting with OpenBSD 7.5, system calls can only be performed in the presence of an .openbsd.syscalls section. For a system call to succeed, its number and the address where it occurs must be recorded in a table within said section. If it's not present, the program terminates with a bus error.

This program must be built with the -no-pie linker parameter; otherwise, the linker will fail to generate the instruction offsets.

.section .text
.globl _start 

_start:
    mov $4, %eax
    mov $1, %edi
    mov $text, %rsi
    mov $13, %edx
write:
    syscall
    mov $1, %eax
    xor %edi, %edi
quit:
    syscall 

.section .data

text: .ascii "Hello, World!"

.section .note.openbsd.ident, "a"
    .align 2
    .long 8, 4, 1
    .asciz "OpenBSD"
    .long 0

.section .openbsd.syscalls
    .long write, 4
    .long quit, 1

Last edited on June 12, 2026.